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Summary 



H 

r-| ■ Best linear unbiased prediction is well known for its wide range of applications including 

-4— > ■ 

^ ' small area estimation. While the theory is well established for mixed linear models and 

under normality of the error and mixing distributions, the literature is sparse for nonlinear 
mixed models under nonnormality of the error or of the mixing distributions. This article 
develops a resampling based unified approach for predicting mixed effects under a general- 
^ ■ ized mixed model set up. Second order accurate nonnegative estimators of mean squared 

\^ ! prediction errors are also developed. Given the parametric model, the proposed methodology 

o ■ 

automatically produces estimates of the small area parameters and their MSPEs, without 
c5 . requiring explicit analytical expressions for the MSPE. 



Some key words: Best predictor; Bootstrap; Kernel; Mean squared prediction error. 



d • 1 Introduction 



Small area estimation (SAE) is an important statistical research area due to its growing 
demand from public and private agencies. The variance of a small area estimator based on 
the direct small area sample is unduly large and hence, there is a need for constructing model 
based estimators with low mean squared prediction error (MSPE). A good account of small 
area estimation research is available in a recent book by J.N.K. Rao (Rao, 2003). Although, 
in theory, it is possible to use such a model based approach, in practice a statistician often 
faces some challenging problems in implementing it due to the fact that for each model, 
estimators must be derived and their properties studied. Indeed, a small deviation from the 
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standard model assumptions may require a considerable amount of analytical work and need 
special expertise. For example, Prasad and Rao (1990) (hereafter referred to as PR) derived 
small area estimation formulas assuming normality of both the sampling distribution and 
the population distribution (for two-level small area models, discussed later) and with the 
moment based estimators of model parameters. After about a decade, Datta and Lahiri 
(2000) extended this approach when the model parameters are estimated by the maximum 
likelihood approach. Recent works of Jiang, Lahiri and Wan (2002) and Lahiri and Maiti 
(2003) (hereafter referred to as JLW and LM, respectively) allow a more general framework, 
but both works require the knowledge of the exact functional forms of the MSPE, which are 
known only in few simple cases. However, a general solution to finding the best estimator of 
the small area parameters or of its functions, and estimation of the associated MSPE are not 
available. A second problem with the existing approaches (except for the LM method) to 
estimating the MSPE is that these methods do not always produce non-negative estimates. 
Though the linearization technique of PR produces non-negative estimates under normality, 
the jackknife method may produce negative MSPE estimates (Bell, 2002). Consequently, 
there is a great demand for a general estimation system where the user can only specify the 
distributions and then valid estimates of the small area parameters and their MSPEs can be 
obtained without much of analytical efforts. 

In this paper, we consider a general two level aggregate data model and develop a unified 
system for prediction of small area parameters and estimation of the associated MSPE. 
Here we extend the "perturbation" or "tilting" method of LM and construct a nonnegative 
estimator of the MSPE that achieves second order accuracy for bias correction without 
requiring explicit analytical derivation of the MSPE function. The key idea is to combine 
the LM approach with the parametric version of the bootstrap method of Efron (1979) so 
that accurate numerical approximations to various intermediate population quantities can 
be generated numerically. We show that under some regularity conditions, the proposed 
MSPE estimator attains second order accuracy for a wide range of parametric distributions 
and for a general class of model parameter estimates and their nonlinear functions, without 
requiring the user to derive the formulas on a case by case basis. 



The rest of the paper is organized as follows. In Section 2, we consider the general two 
level aggregate data model that is commonly used in the context of small area estimation. In 
Section 3, we describe the (estimated) best predictor of functions of the small area parame- 
ters. In Section 4, we briefly describe the existing approaches to MSPE estimation and also 
give a description of the proposed method. Theoretical properties of the proposed method 
are given in Section 5. Results from a simulation study and some concluding remarks are 
presented in Sections 6 and 7, respectively. Proofs are given in the Appendix. 

2 Generalized Mixed Models for Small Area Estimation 

Consider the general two level aggregate data model 

|/,|^/^"Fi(-;^„i?,), ^. ~'F2(-;x„A,G,), ^ = l,---,m, (2.1) 

where, Ri and Gi are known functions of a vector of p-parameters ip = {ipi, . . . ,ipq), say, 
{Ri,Gi) = gi{4'). Thus, the model is determined by the parameter vector 5 = {\ ,^ )^, a 
(p + q) xl vector of constants. Usually, t/j's are direct survey estimators with sampling vari- 
ance Ri, ^j's are small area parameters and Xi a set of covariates available at the estimation 
stage. Aggregate and generalized linear mixed effects models are special cases of (2.1). 
Consider the Fay-Herriot (1979) type small area model 

yi = ei + a, Oi = xJX + Vi (2.2) 

where e^'s are independent A^(0, Sj) with known Sj, f j's are iid A^(0, a^) and Cj and f j's are 
independent. Furthermore, Xj is a known p x 1 vector of co-variates, A is the vector of 
regression coefficients; yi is the direct survey estimator of 6i. Note that (2.2) can be written 
as yi = xJX + Vi + Ci which is a special case of a linear mixed model where both Fi and F2 
are normal cdf. 

Next consider the mixed logistic model, where conditional on small area parameter pi, 
the direct estimator yi is binomial {ni,pi),i = 1, ... ,171; here n, is the number of sampled 
units in the i-th small area. Then, consider the model 

9i = logit{pi) = xJX + Vi, (2.3) 



where the f j's are iid A^(0, o"^). In this case Fi is binomial and F2 is normal in the logit scale. 
This is a special case of generalized linear mixed model. 

Our objective is to make inference about a function of the small area parameter 6i 

Pi = h{ei),t=l,...,m, (2.4) 

where /i is a suitable function chosen by the user. For example, the "Small Area Income and 
Poverty Estimation" (SAIPE) project of the US Census Bureau uses the log value of the 
direct estimates for estimating poverty at the county level and thus an inverse transformation 
needed for the parameter of interest. We would like to emphasize that, at the second level 
of modeling, the structure always need not be of the form h{6i) = xJX + f j. In fact, we can 
also use nonlinear modeling, such as h{6i) = K,{xi; A, fj). where k is a nonlinear function. 

3 Development of the Best and Empirical Best Predictors 

As an estimator of the small area parameter, we will take the best predictor (BP) as defined 
below. Let (3i = h{9i) be the parameter of interest. We define the BP and the empirical best 
predictor (EBP) of /9j, respectively, by 

A = E^{h{e,)\y}, (3.1) 

A = E'^{h{e,)\y}, l,...,m, (3.2) 

where 5 is an estimator of 5. For example, in the Fay-Herriot model (2.2), the BP of 
h{9i) = 6i takes the form /3j = xJX + —{Vi — xJX), where Ti = a^ + Sj. For a general 
h{-), however, a closed form simple expression for the BP/EBP and their MSPE may not be 
available. Consequently, the PR-type SAE methodology based on Taylor's expansions may 
not be readily applicable. 

Next, we derive some useful general formulas for the EBP of (3.2). Note that by the 
independence of t/j's, the conditional distribution of 6i given t/i,...,|/„ depends only on 
Hi (and 6). Hence, Pi = E§{h{9i)\y} = j h{t)Fe^\yXdt]6) = ^i{yi;S) say, where Fe^\yX-]S) 
denotes the conditional distribution of 6i given yi. The EBP is given by 

^ = Uy^■,^s). (3.3) 



First consider the case where the marginal distribution of 9i has a probabihty density function 
(pdf) /2(-; Xj, A, Gi) (with respect to the Lebesgue measure) and the conditional distribution 
Fi{-;6i,Ri) of yi given 6i has a generalized density fi{-;6i,Ri) (i.e., the Radon-Nikodym 
derivative with respect to a o"-finite measure). For example, /i can itself be a pdf or a 
probability mass function (pmf) for a discrete probability distribution. In this case, the 
EBP is given by 

j Pi{yi,t;d)dt 
where pi{y,t;6) = fi{y;t,Ri)f2(t;Xi,\,Gi). Next consider the case where the marginal 
distribution of 9i is discrete and has a pmf /2(-; Xj, A, Gi) and Fi(-; 9i, Ri) has a generalized 
density fi{-;6i,Ri) as above. Here the EBP is given by 

EtPi{yi,t;d) 

where pi{y,t;6) is as before and where the sum in (3.5) runs over all t in the support 
of 6i. In many applications, formulas (3.4) and (3.5) can be implemented using numer- 
ical methods, e.g., numerical integration, MCMC, importance sampling, etc. For exam- 
ple, for the logit-normal model with the canonical link, C,i{yi',S) = [Jai{t){yi + 1){1 + 
ai{t)}^'^'^^(/){t)dt]/[ J ai{t)yi{l + ai{t)}^'''''(f){t)dt], where ai{z) = exp{xj X + a^z) and is the 
N(0,1) pdf (e.g., see, McCulloch and Searle (2001, pp 273) and JLW). In this case, the EBP 
can be easily evaluated by generating N(0,1) variates and using the Monte-Carlo method. 
Remark 1: (Parameter estimation) . In general, the maximal likelihood estimates (MLE's) 
do not have any closed form expressions. Except for the conjugate and linear link models, 
maximization of the marginal likelihood involves integration with respect to the distribution 
function F2. There is no unique way of evaluating this integral. Using advanced techniques 
such as EM based MLE, Markov Chain Monte Carlo (MCMC) based MLE, etc., the MLE's 
can be obtained for a large class of distributions. An excellent account of guidelines for the 
general mixed linear models can be obtained in Chapter 10 of McCulloch and Searle (2001). 
We mention that the SAE methodology developed here is equally applicable for other type 
of parameter estimators such as those based on method of moments or estimating equation 
approaches, provided they are m^^^ consistent. 



Remark 2: For situations where a direct implementation of (3.4) or (3.5) is difficult, we 
now describe some approximations to the EBP using the bootstrap method of Efron (1979) 
and the nonparametric functional estimation methodology. Note that /3j is the conditional 
expected value of a function of 6i for fixed 5 evaluated aX 5 = 5. This suggests that under 
mild regularity conditions, we may approximate f3i to any desired level of accuracy by using 
standard regression function estimation methods, such as Nadaraya- Watson estimators, local 
polynomial estimators, etc. Let {y*"', 9*''}j^i be generated values using model (2.1), but with 
6 = 6. When the distributions of 9i and i/i are continuous, we propose a Nadaraya- Watson 
approximation to Pi, given by 

where A;(.) is a symmetric kernel function chosen suitably. There are many choices of /?(.), 
such as a Gaussian kernel k(.){x) = \k{x/h) where h is the bandwidth and k{x) = 0(x), the 
standard normal density function. On the other hand, when the marginal distribution of i/i 
is discrete, we propose 






p: = -^=\'' ' ^y'=y\ (3.7) 



where /[.] denotes the indicator function. Results on Nadaraya- Watson estimators of regres- 
sion functions imply (cf. Hardle (1991)) that 



\p^ - p*\^ = 0[{Jb)-' + r'j in probability, (3.8) 

as J ^ oo and 6 ^ in such a way that Jb -^ oo. The bound in (3.8) is available uniformly 
over i = l,...,m, provided there exists a constant C G (0, oo) such that E^\^^ 0^i]t)\ + 
E^Iqj^ {Yf, t)\ < C for alH = 1, . . . , -m and for all t G A/", a neighborhood of the true value of 
the unknown parameter 6. Here, ^"(i/;t) = ^^i{y;t), g"{y;t) = ^gi{y;t), and gi{y;t) is 
the marginal density of Yi. For the discrete case, a direct computation shows that 

\p- - /3*|2 = 0(^.r^) in probability, (3.9) 

as J ^ oo. This bound is also available uniformly in i, provided E^\h{6i)\'^ < C for all i and 
for all t G A/", where C G (0, oo) is a constant, and N is as above. 



Thus, for both the discrete and the continuous data, the accuracy of the approximation 
P* to Pi increases with larger values of J. For the continuous case, we need to specify a 
choice of the bandwidth b. For kernels arising from symmetric probability densities, the 
optimal choice of b is of the order J~^/^. We take the bandwidth b of this optimal order, 
e.g., b = J~^/^, and attain a desired level of accuracy by choosing J suitably large. Finite 
sample accuracy of the approximations (3.6) and (3.7) are typically very good. See Table 1 
in Section 6 below which reports the relative biases and MSPE'S of (3.6) and (3.7) for the 
normal-normal and the logit-normal examples. 

4 Mean Squared Prediction Error and its Estimation 

4.1 Background 

As a measure of accuracy of the EBP Pi, we shall consider the Mean Squared Prediction 
Error(MSPE) of A, MSPE{pi) = E^{pi - pif = Mi{6). It is easy to show that 

M,{6) = E^iP, - P,f + E^{k - hf = Mu{6) + M2,(5), say. (4.1) 

The first term Mii{5) is the mean squared error of the (ideal) best predictor Pi while the 
second term M2i{5) accounts for the extra variability due to the estimation of 5. Typically, 

Mii{5) = 0(1) and M2i((5) = 0{m-^) as m ^ oo. (4.2) 

It is tempting to plug in 6 in (4.2) and get a simple MSPE estimate as 

mspesiM(A) = Mui6) + M2^{5). (4.3) 

However, this approach has two drawbacks. First, explicit expressions for the functions 
Mii{5) and M2i{5) are not always available. In the very special case of the normal- normal 
Fay-Herriot model, an expression for Mii{5) and an approximation for M2i{5) are available 
for h{9i) = 6i, i = 1, ... ,771. Even for this model, expressions are not available for a nonlinear 
function of 6i and one has to derive those. For example, Slud and Maiti (2006) derived the 
expressions for MSPE estimates under normal set up when h is an exponential function. 



The second problem with the above approach is a httle more subtle. To describe it, 
note that typically, the estimator 6 has bias and variance of order 0{m^^), which propagate 
through the simple MSPE estimator, leading to E{MiiCS)} = Mii{6)+0{m-^) and E{M2ii5)} 
M2i{S) + o{m~^) as m -^ oo. (Here and in the following, we often drop the subscript 6 to 
ease notation). Thus, E{Mii{6) — Mii{6)}, the bias of the simple estimator of Mii{6), is of 
the order 0{m~^) which masks the contribution of M2i{-) to the MSPE of f3i (cf. (4.2)). 

In view of the second problem, in the SAE literature, it is customary to require that the 
bias of a "good" estimator of MSPE(/3j) be of smaller order than 0{m^^). Traditionally, the 
bias of the naive estimator Mii{6) is reduced by explicit bias correction, either by using a 
Taylor's expansion of the function Mij(-) (cf. PR) or using the Jackknife method (cf. JLW). 
Other related work include Pfeffermann and Tiller (2005) and Pfeffermann and Glickman 
(2004). The first paper approximated M2j(.) and the bias of Mii{6) under a state space 
model based on parametric bootstrap, assuming normality of the errors. The second used 
a bias corrected estimator of Mij(.) and a parametric bootstrap estimator of M2i{.), for the 
Fay-Herriot model. Pfeffermann and Glickman also developed a 'nonparametric' bootstrap 
method that did not require generating samples from a distribution. Nonetheless, normality 
was still assumed implicitly. In a recent work, LM proposed a new approach to bias correction 
that attains second order accuracy and at the same time, produces a nonnegative estimator 
of the MSPE. Here, we extend the LM approach to the case of estimating the MSPE of 
a general function of 6i with second order accuracy under a general two-level parametric 
model, even when exact expressions for the functions Mij(-) and M2i{-) are not available. 

For completeness, we now briefiy describe the LM method. Suppose that for i = 1, ■ ■ ■ , m, 

k 

j:\Mii>{6)\>eo, (4.4) 

i=i 
for some eo > 0, where for a smooth function / : iR^ -^ ]R, f^^\ /(■'''") and /(-J'^'*) denote the 

first, the second and the third order partial derivatives with respect to the j-th co-ordinate, 

the (j, r)-th co-ordinates, and the (j, r, s)-th co-ordinates, respectively, j.,r^s = l,---,k, 

where k is the number of model parameters. Condition (4.4) says that M{1 (5) ^ for 

some j. For notational simplicity, we suppose that M^^ (5) ^ 0. Then, the preliminary 



perturbed estimator of 5 for the i-th small area is defined as 5i = 6 — -BjjMfj (5)} ^ei where 
B^ = EU^^mj) + lEU^r=iMir\6)Vij,r), with b = (6(1),..., 6(A;) and V = 
{{V{j,r)))kxk respectively denoting some suitable estimators (e.g., bootstrap estimators) of 
the bias and the variance of 6, and e^ G M'' has 1 in the r-th position and zeros elsewhere, 
1 < r < k. The LM estimator of the MSPE is now defined as 

mspei^M0i) = Mu(5i) + M2i(5), i = 1, ■ ■ ■ , m, (4.5) 

where 5i is the perturbed estimator of 5 for the i-th small area, defined by 

. \~5, if 5, G A and |Mi')(5)|-i < (1 + logm)^ 
Oi= { ^ (4.6) 

I 6 otherwise, 

and A is the set of possible values of the parameter 5 under model (2.1). Note that by 
construction, the MSPE estimator is always nonnegative. Further, LM show that under 
some regularity conditions, the bias of the estimator mspei^y[{f3i) is of the order o{m^^). 
Remark 3: When more than one partial derivatives M{1 [5) are non-zero, one may use 
perturbations along all such directions. Thus, an alternative MSPE estimator is given by 

"^^P^LM:ALt(A) =^ii(^i) +^2i(^i), i = l,---,m, (4.7) 



where 



i] 



S\ if S\ e A and I J|-i Ejej Im!?!^)!'" < (1 + logm)' 
5 otherwise, 



5\ = 5- Y.,eJ[B^/M^^(5)]e,/\Jl J = {j ■.l<3 < k.M^^S) ^ 0} and for any set A, 
let \A\ denotes its size. The arguments developed in LM readily imply that the new MSPE 
estimator is also second order correct, under the same set of regularity conditions as in LM. 
By combining all \J'\ directions, the new estimator attains a better finite sample stability. 

4.2 Nonnegative estimation of the MSPE when expressions for Mu and M2i are Unavailable 

As discussed earlier, except for very few standard models, exact or closed form expressions for 
the terms Mij(-) and M2i{-) are not available. Here we employ the Bootstrap method of Efron 
(1979) to develop an approximated version of the estimator mspeLM that is nonnegative, 
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second order accurate, and that can be computed without additional analytical work on the 
part of the user. To that end, first we define a bootstrap based approximation to the function 
Mij(-) at a given value Sq- Let {yf, 6f), / = 1, . . . , A^o be iid random vectors generated using 
model (2.1) with 6 = 6o. Then the bootstrap approximation to Mii{6o) is given by 

No 



1 " r 1 2 



Next we use M*j(-) to construct estimators of the partial derivatives of the function Mij(-). 
To motivate the construction, consider a smooth function f : M ^ Ml. Then, for any a E M, 

f{a + 6) - /(a - e) = {/(a + e) - /(a)} - {/(a - e) - /(a)} = 2e/'(a) + o(e) 

as e -^ 0, where /'(a) denotes the derivative of /(•) at a. Hence the scaled difference 
(2e)^^{/(a + e) — /(a — e)} gives an approximation to f'{a) for small values of e. We now 
employ this fact to define suitable approximations to the first order partial derivatives of 
Mii(-) at 6. Let {zm} be a sequence of positive real numbers converging to zero. Let 

MitCS) = 7^{m*uC6 + zme,) - MlX5 - z^e,)], (4.9) 

j = 1, . . . ,k. Using a similar reasoning, we also define approximations to the second order 
partial derivatives as 

mJP(5) = ^[Ml^(5 + Zme,) + M*^i5-Zrr.e,)-2Ml^i5)}, I < J < K (4.10) 

-zj{Mif'>{6) + Mil'^>{6)}], l<jj^r<k, (4.11) 

where Cj^r = Gj + Gr- Theorem 1 in Section 5 shows that under some regularity conditions, 
maxi<,<„E|M£>(5) - Mg)(5)| = 0{zm + {zmV^N^'''^^^''^). and maxi<,<„ E|Mg'")*(5) - 
Mjf"^(5)| = 0(2„ + (2„)-2iVo"^/(i+'')). for all 1 < j,r < A;, for some t] G (0,1]. Thus, the 
proposed estimators of the partial derivatives provide accurate approximations for suitable 
choices of z^ and A^o- 

Next for / = l,...,A^Oj let (y*', . . . ,|/i^) be iid random vectors having joint distribu- 
tion (2.1) with 5 = 5 and let 5* denote the bootstrap version of 5, obtained by replacing 

10 



{yi, . . . , Hm) with (yf , . . . , yj^). Define the bootstrap estimators of the bias and the variance 
of 5 by 

b' = jri:^"-i and r = {-Er'(«r}-(-i:r')(-^i:r') , (4.12) 

respectively. Theorem 2 in Section 5 below gives conditions for the consistency of h* and V* . 
With this, we now define the bootstrap based preliminary perturbed estimator 5^ as 

5l = 5- B*{M[f*{5)Y\s, provided \M[f{5)\ ^ for some s = s^ G {1, . . . , A;}, 

where B* = Ej=i M^f *(5)6*(j) +2"^ E']=i Er=i Mif''^*{6)V*{j, r), x(j) denotes the jth com- 
ponent of a vector x and B{j,r) denotes the (j, r)th element of a matrix B. The bootstrap 
based perturbed estimator of 6 for the ith small area is now defined as 

is* if^:GAand|M(f(5)|-i<(l + logm)2 
(^i = S . (4-13) 

I 6 otherwise 

and the bias corrected estimator of Mii{6) is given by Ml^{6*)., i = 1,- ■ ■ ,m. 

Next we define the bootstrap estimator of M2i{S). Note that M2i{S) = E^{Pi — PiY = 
E^{^i{yi, 5) — ii{yi, 5)} . Let 5*\ I = 1, . . . , Nq denote iid bootstrap replicates of 6 as above 
(cf. (4.12)). Then, the parametric bootstrap estimator of M2i{S) is now defined as 

No 

Mm = N,-' Y: {e.(yf, S*^) - 6(1/;', S)}'. (4.14) 

1=1 

Pefferemann and Tiller (2005), Pfeffermann and Glickman (2004) and Butar and Lahiri 
(2003) also proposed similar parametric bootstrap estimates of M2j(.) for normal errors. 
The proposed bias corrected estimator oi the MSPE Mi{6) is defined as 

mspeNEw(A) = MUm + Mm, (4.15) 

i = 1, ■ ■ ■ ,m. In the next section, we show that under some regularity conditions, the pro- 
posed estimator has a bias that is of the order o{m^^). As a result, the proposed estimator 
attains the same level of asymptotic bias accuracy as the previously proposed MSPE estima- 
tors. Furthermore, as (4.15) does not require explicit expressions for the functions Mu and 
M2i, the proposed MSPE estimation methodology can be applied to complex or nonstandard 
models where none of the existing methods are easily applicable. 
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O Theoretical Properties 

For investigating the theoretical properties of the proposed method, we shall suppose that 
the random variables {yi,Oi) : i = l,...,'m and the various bootstrap variables {yf,6f)^s 
are defined on a common probability space. We write Px and Ex to denote the probability 
and the expectation under a given parameter value a; G A. For notational simplicity, we set 
P^ = P and E^ = E where 6 is the true value of the parameter. Define the functions a(-) 
and S(-) by b{6) = a{5)/m and V{5) = T,{6)/m, where b{6) = E^{6)~6 and V{6) = var^{6). 
Note that a, b, S and V depend on m. Unless otherwise specified, limits in the order symbols 
below are taken asm ^ oo. Also, let E r- denote the conditional expectation of the bootstrap 
variables given 6. Proofs of the main results are given in the Appendix. 

Conditions 

(C.l) (i) 6, the true value of the parameter, is an interior point of A. 

(ii) Mii is three times continuously differentiable on A and there exists a constant 
Ci G (0, oo) such that for all x G A,j,r,s = l,---,k and i = 1,- ■ ■ ,m,m > 1, 
\Mip(x)\ + \Mil''\x)\ + \Mif'-''\x)\ < Ci. 

(iii) M2i is differentiable on A and there exist constants (72,6*3, cq G (0, oo) and 
7 G (0, 1] and a function Gi : IR^ -^ [0, oo) with EGi{6) = 0(1) such that for all j = 
l,---,k;i = l,---,m,m>l, \Mif{6)\ < C2m~\\M2i{x)\ < m-'^Gi{x) for all xe 
A and m\M^\x) - M^{^{6)\ < CaWx - 5p for all xeM, where Af = {\\x - 5\\ < cq}. 

(C.2) There exist constants r] G (0, 1] and C4 = C^iv) G (0, 00) such that E|/i(^i)|2+2^ < C4 
for all i = 1 , ■ ■ ■ , m, m > 1 . 

(C.3) (i) Let pm{x; a) = Ex\\S — x\\"-, X E A, a E {0,00). Suppose that there exists a constant 
7] G (0, 1] such that Ep^{6; 2 + 2r]) = 0(1). 

(ii) The sequences of functions {a} = {dm} and {S} = {Em} are (component-wise) 
equicontinuous at 6. 

(iii) There exists a continuous function G2 : M^ -^ [0, 00) such that ||am(a;)|| + ||Sm|| < 
02(3;) for all a; G A and EG2i6f = 0(1). 
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(C.4) There exists a constant i] E (0, 1] such that El ^/m\S — S\\> = 0(1). 

We now briefly comment on the regularity conditions. Condition (CI) requires the func- 
tions Mii and M2i to be smooth, which typically holds under suitable smoothness conditions 
on the parametric model (2.1). As mentioned earlier, in most applications the function Mu 
is of the order 0(1) while M2i is of the order 0{m^^) as m ^ oo. Condition (C.l) requires 
that the partial derivatives of these functions also have the same orders. Conditions (C.2), 
(C.3)(i), and (C.4) are moment conditions depending on t], whose values will be specified 
in the statements of the results below. These are used to prove 'closeness' of various para- 
metric bootstrap estimates to their conditional expectations. Condition (C.3)(ii) and (iii) 
are exclusively used to establish consistency of the bootstrap estimators of the bias and the 
variance estimators of 6. 

The first result proves consistency of the partial derivative estimates. 



Theorem 1: Let Conditions (C.l)(ii) and (C.2) hold and let A^o be as in (4.8). Then 

max max E m[{'>*{6)-M[{\6) = 0(zm + [zmr'^No~], (5.1) 

l<j<k l<i<m V / 

Mir^*{5)-Mif'\5)\ = 0(^™ + [^™]-X~^). (5.2) 



l<:j,r<k l<i<m 

Note that the right sides of (5.1) and (5.2) go to zero for any 2;^ — ^ 0, Aq ^ oo such that 
Zm^N^ ^ -^ OO. Here z^ acts as a smoothing parameter that controls the bias parts of 
the proposed estimates. For a smaller value oi Zm, a larger value of the resample size Aq has 
to be chosen accordingly to attain a desired accuracy level. Also, note that the value of Aq 
required for estimating the second order partial derivatives must grow at a faster rate than 
the case of the first order partial derivatives to attain the same level of accuracy. 

The next result considers accuracy of the bootstrap bias and variance estimators of 6. 

Theorem 2: Let Condition (C.3) hold and let A'q be as in (4.8). Then, 
E\\b* -bf = 0{Nq^) and E\\V* - V\\ = 0{No^). 



Under the conditions of Theorem 2, the bootstrap bias estimator is A'o^ consistent. The 
variance estimator can also attain the same rate, provided rj = 1. Note that unlike Theorem 
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1, the estimators of the bias and the variance matrix of 6 do not involve a smoothing 
parameter hke Zm- 

The next result shows that under suitable conditions, the proposed estimator of the 
MSPE(/3j) second order bias corrected. 

Theorem 3: Suppose that conditions (C.1)-(C.4) hold and that rj = 1 in both (C.2) and 
(C.3). Suppose that for each i = 1, . . . ,m, there exists s = s{i) G {1, . . . , /c} such that 

\M[f{6)\>Co (5.3) 

for all m > 1, where Cq E (0, oo). Let Zm = m~^^* and A^o > ""^^ for some a > 9/2. Then 
the proposed mspe estimator is second order bias accurate, i.e.. 



max 

l<i<m. 



E 



mspeNEw(A) - Mi{6) = o{m ^). (5.4) 



Theorem 3 shows that the proposed MSPE estimator achieves the same second order bias 
accuracy as the earlier methods proposed in the literature. Thus, under the given regularity 
conditions, the additional randomness induced by several resampling steps has a negligible 
effect on the bias of the new estimator. Since it also does not require the knowledge of the 
functions Mij(-), M2j(-), of their the partial derivatives, and of the bias and variance of the 
estimator 5, the proposed method can be applied to any model of the form (2.1), where the 
other methods are not readily applicable. The price paid for this omnibus solution is that it 
is computationally intensive. 

6 Practical Implementation and Numerical Findings 

6.1 Finite sample considerations 

In this section, we provide some guidelines for implementing the proposed MSPE estimation 
methodology in finite sample applications. Supposing, for the time being, that an expression 
for the BP is known, computation of different parts of the estimator mspcNEW involves 
generating (parametric) bootstrap samples from the joint distribution of (|/j, 9i) for i = 
1, ... ,171 (cf. (2.1)) at various values of the parameter 6. For the bootstrap bias and variance 
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estimators b and V and the term M|j(5), we suggest using a resample size (drawn from 
(2.1) with 6 = 6) in the 100s (e.g., in the range 500-1000). This is known to be adequate 
for Monte-Carlo evaluation of bootstrap estimators of variance-type functionals (cf. Efron 
and Tibshirani (1993)). Next consider numerical evaluation of the first term of mspcNEW) 
i.e., of M^-{6^). This requires us to approximate the partial derivatives of Mij(-) which, in 
turn, involve the smoothing parameter Zm- For all computations done in this section, we 
set Zm = m~^/'^ as in Theorem 3, although other choices of Zm ^ rn^^ may be used. For 
the numerical approximation of the partial derivatives, the resample sizes must be larger in 
order to compensate for the effect of the smoothing - the smaller the choice of the smoothing 
parameter Zm-, the larger the choice of A'^o will have to be. For m of moderate size (e.g., 
m G (10,80)) and Zm as above, we have found resamples of size A'"o in the range 2000- 
10,000 adequate for computing the first order partial derivatives M*j and resamples of size 
A^o ~ 10) 000 for the second order ones Mj^j . Finally, in the case that an exact expression 
for the EBP is not available and it is approximated numerically using (3.6) or (3.7), the 
resample size J may be chosen in the 100s (e.g., 300-1000) in the discrete case while it 
must be of a higher order (e.g., 1000-I-) in the continuous case. Approximations given by the 
above choices of the resample sizes are generally very good. In the next section, we report the 
results of a simulation study and the associated computing time for three specific examples 
where we follow the finite sample guidelines given above. For an illustration. Table 1 below 
gives the resulting approximations for the EBP both in the discrete and the continuous cases 
which appear to be in good agreement with the true values. 

6.2 Simulation results 

In this section, we check the performance of the MSPE estimators (4.5) and (4.15) for Models 
I-III described below, and compare them with the Datta-Lahiri (2000) (hereafter, referred 
to as DL) version of the PR method and the jackknife method of JLW, as described in Rao 
(2003). DL extended the PR method when the model parameters are estimated using MLEs. 
We used MLEs of the model parameters for Models I and II, and used estimating equations 
for Model III. Normal kernel was used for the kernel based EBP's. We use the following 
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notations for different methods of MSPE estimation: JK for jackknife, LMl for (4.5) and 
LM2 for (4.15). 

Model I: Normal-Normal. This is a continuous data model, where both Fi and F2 are 
normal; The model structure is specified by (2.2) with A = 0. In this setting, all four methods 
of bias correction are applicable. Although in this case a closed form expression for the BP 
exists, to gain some insight into the performance of the suggested approximations, we use 
(3.6) to find the BP for the LM2 method. For the other three methods, the available closed 
form expressions are used. We choose F2 to be normal with mean and variance unity, and 
Fi to be normal with mean and variance Si,i = 1, ■ ■ ■ ,m. with m=15. The 15 areas are 
divided into three groups of five, with equal numbers of areas and equal values of Si. The 
three different values of Sj used are (.7, .5, .3). The set-up is similar to the one considered by 
Datta et al. (2005). 

Model II: Binomial-'Logit-Normal'. This is a binary data model where we suppose 
that -Fi is binomial and F2 is logit-normal. In particular, the logit of the success probability 
of Fi is normally distributed with mean zero and variance unity (cf. (2.4) with A = 0). In 
this setting, only JK and LM2 methods of MSPE estimation are applicable. The binomial 
population has 8 areas, of respective sizes nj=36, 20, 19, 16, 17, 11, 5 and 6, based on the 
number of patients receiving a particular treatment from different clinics (Booth and Robert, 
1998). To generate the ith binomial population, we first generate the success probability 

P. = , r'^:' , (6.1) 

1 + exp(/i + t;j) 
where Vi is a standard normal variate, i = 1, ■ ■ ■ , 8 and /i = 0. In this case the BP is not 
available in a closed form. We first find the maximum likelihood estimates of the model 
parameters using Slud (2000). Then the BP is calculated using Gauss-Hermite quadrature 
with 15 points for the JK method and (3.7) for the LM2 method. 

Model III: Normal-Lognormal. This is a continuous data non-conjugate model, where 
Fi is normal and F2 is lognormal. You and Rao (2002) considered this model for estimating 
the Canadian census undercoverage and called this as 'unmatched sampling and linking 
model'. Here, neither the PR/DL nor the JK methods are applicable in a straightforward 
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way. We took m = 15 and generated ^j's {i = 1,- ■ ■ ,m) from a lognormal distribution. We 
took two covariates, besides the intercept, one was generated from A^(0, .5) and the other was 
generated from Uniform (.5, 1). We set 6 = {\ , al)'^ = (0, 0.5, —1.5, 0.5)"'". Then given ^j's, 
l/j's were generated as in Model I. Instead of using ML estimate, we used unbiased estimating 
equation approach for estimating the model parameters (cf. Ghosh and Maiti, 2004). Since 
the BP does not have any closed form expression, we used the kernel based estimator (3.6) 
for estimating the BP and consequently, of the four, here LM2 is the only method available 
for estimating the MSPE. Also, note that in this case, one can obtain the perturbed estimator 
of 6 either by (4.15) or by the (estimated version of the) method described in Remark 3. 
Both methods gave very similar results. The MSPE estimator in Remark 3 (with estimated 
partial derivatives, etc.) gave slightly low CV than the estimator in (4.15); see Table 2. 

In implementing LM2, we used 1000 bootstrap samples for finding the bias and variances 
estimates and 10000 bootstrap samples for all other approximations. All simulation results 
were based on R=1000 replication. The approximate computation time for each model is at 
most 48 hours on a UNIX workstation equipped with 4000MHz 64-bit CPU and FORTRAN 
77 compiler. In any real application, user needs to run the code only once, meaning minimal 
computational time (less than 3 minutes) with data sets of a similar size. 

To study the performance of the EBP 9i of the small area parameter 6i, we use the 
following two empirical measures. 

1 R 0('-) _ 0('-) 
Absolute relative bias Ti = — T^ 1^^ — t-t^ — |. (6.2) 

1 ^ 

Empirical MSPE T2 = -Y.i^f^-^t^f- (6-3) 

The body of all the tables gives averages over all the small areas where the "average" is 
measured in terms of the median (given in the first column for each model) or the mean (in 
the second column). 
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Table 1. Absolute relative bias (Ti) and empirical MSPE (T2) for the EBP. Results using 
the kernel based approximations (3.6) and (3.7) are reported within the parentheses. 



Model I 

Measures Median Mean 
Ti 2.318 4.171 



Model II 

Median Mean 
0.223 0.243 



Model III 
Median Mean 



(2.156) (4.122) (0.224) (0.243) (0.926) (0.923) 
T2 0.376 0.366 0.0107 0.0131 — — 

(0.378) (0.373) (0.0107) (0.0131) (0.269) (0.292) 



There is a good agreement between the actual values and the approximations for the 
EBP given in equations (3.6) and (3.7). For the binary data, this agreement is particularly 
remarkable. This is because for the same value of the resample size J, the approximation in 
the discrete case is more accurate (having a faster rate of convergence). In the case of the 
binary data, the "actual" values are found by numerical integration. The simulation result 
shows that both the numerical integration based approximation and the "kernel" method 
based approximation (3.6) behave similarly. However, kernel method seems more automated 
than numerical integration as it does not require additional programming for a different 
continuous data model. 

Table 2 reports the following empirical measures of relative bias and coefficient of varia- 
tion, quantifying the performances of different MSPE estimation methods: 



Relative bias T3 
Coefficient of variation T4 



[E{MSPE{e,)}-T2]/T2 
E{MSPE{ei) - T^Y] ^ /T2. 



(6.4) 
(6.5) 



Here E{MSPE{6i)} and E{MSPE{6i) — T2Y are estimated empirically by averaging the 
replicates of MSPE{9i) and {MSPE{()i) - T2Y, respectively. 



Table 2. Relative biases (T^) and coefficient of variations (T4) for the bias corrected 
estimators of the MSPE. Entries within parentheses represent LMl and LM2 estimates 

based on Remark 3 modification. 



PR/DL 



JK 



LMl 



LM2 





Model I 


Model II 


Model III 


asures 


> Median 


Mean 


Median 


Mean 


Median Mean 


T3 


-0.016 


-0.004 


— 


— 


— — 


T4 


0.159 


0.150 


— 


— 


— — 


^3 


0.068 


0.095 


-0.088 


-0.026 


— — 


T4 


0.504 


0.635 


0.686 


0.758 


— — 


T3 


-0.015 
(-0.000) 


-0.018 
(0.050) 








— — 


T4 


0.158 
(0.153) 


0.151 
(0.149) 


: 


: 


: : 


^3 


-0.013 


-0.028 


-0.108 


-0.083 


0.116 0.041 




(-0.019) 


(-0.024) 


(-0.087) 


(-0.083) 


(0.115) (0.044) 


T4 


0.229 


0.224 


0.172 


0.164 


0.319 0.368 




(0.225) 


(0.218) 


(0.170) 


(0.156) 


(0.310) (0.298) 



For Model I, all the methods perform well in terms of minimizing relative bias. However, 
in terms of the coefficient of variation, there is a difference in the performance of the four 
methods. The PR/DL and LMl methods turn out to be the best, followed by the LM2 
method. The small increase in the variation of the LM2 method over the LMl method is 
expected, as the randomness in the various approximation steps in its construction adds to 
the total variability of the bias corrected MSPE estimator. However, the highest variation for 
this model is observed for the JK method, where the variation more than double compared to 
the LM2 method and it is more than three times compared to the LMl and PR/DL methods. 

As mentioned earlier, for Model II, only the LM2 and the JK methods are applicable. In 
this case, the LM2 tends to have higher relative bias. However, in terms of the coefficient 
of variation, which gives the combined effects of the bias and the variance of the MSPE 
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estimators, the LM2 method again beats the JK method by a relative magnitude of 300% 
to 400% or more. To gain further insight into the bias properties of the two methods, we 
repeated the simulation study with m = 16 areas (instead of the m = 8 areas considered 
earlier) under Model II. For this higher value of m, we found that the relative bias for the 
LM2 method dropped to -.038 and -.024 for the median and the mean, respectively. The 
eight additional small area sizes were 37, 32, 19, 17, 12, 10, 9 and 7. In comparison, the 
relative bias for the JK method under m = 16 were -.025 and -.026 for the median and mean 
respectively. The coefficient of variations for the two methods continued to show a similar 
pattern as in the m = 8 case. Thus for both models, the estimators produced by the JK 
method have inferior performance in terms of the coefficient of variation. 

For Model III, the PR/DL method is not applicable and the existing literature does not 
show how to apply the JK method. This is a somewhat unusual set up of simulation within 
the existing SAE literature. It may be interesting to know that, if some one naively used 
Mi{S) with formula (4.8), the median relative bias would be -.227 and the mean, -.260. 
This indicates severe under-estimation which is expected. In comparison, LM2 produces 
satisfactory results for both the relative bias and the coefficient of variation. 

7 Discussion 

In this paper, we consider a new method of bias correction for the "simple" estimator of 
the MSPE of a possibly nonlinear function of the small area means h{9i),i = l,---,m. 
The proposed method may be contrasted with the existing methods, which require explicit 
analytical expressions for bias correction. The popular linearization method of bias correction 
proposed by PR can not be easily extended to nonlinear h and non-normal models. Further 
the PR approach is sensitive to the method of estimating model parameters in the sense 
that additional analytical work may be needed for each new estimation method. In the cases 
where exact analytical expressions are available, the simulation results indicate that the PR 
and the LM methods (are comparable and) have the best overall performance (in terms of 
MSEs) while the proposed method (LM2) fares reasonably well against these. In particular, 
LM is preferable to LM2 in such situations. As for comparison with the JK method in this 
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case, the LM2 method performs much better than the JK method in finite samples in terms 
of the co-efficient of variation. 

In the more comphcated examples, where exact analytical expressions for the MSPE are 
not available, the LM and the PR methods are not applicable, but the LM2 method and the 
JK method (with some suitable adaptation) are. In this case, the LM2 method seems to have 
a superior performance compared to the JK method in terms of overall accuracy. Further, 
because of the inherent limitations of the Jackknife method for estimating the variance of a 
non-smooth estimator of the model parameters (e.g., the sample median), the JK method 
may produce an inconsistent estimator of the MSPE (more precisely, of the variance type 
term M2i), while the bootstrap based LMl and LM2 methods would still work (cf. Ghosh et 
al. (1984)). From this point of view, the proposed method of MSPE estimation has a wider 
range of validity than the JK method. 

In this paper, we also prove that the proposed estimator of the MSPE attains the same 
level of asymptotic accuracy as the existing methods in correcting the bias of the simple 
MSPE estimator. We also report the results of a small simulation study and provide some 
guidelines for implementing the methodology in practice. In summary, the proposed method 
allows a user to routinely derive second order accurate, nonnegative estimates of the MSPE 
in small area estimation problems, without requiring any analytical work on the part of the 
user. 
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Appendix A: Proofs 

Let W = {1, 2, . . .}. In the proofs, we suppress dependence of various quantities on m 
unless there is a chance of confusion and write C, C(-) to denote generic positive constants 
that depend on their arguments (if any), but not on i G {1, . . . , m} or m. 
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Lemma 1: Let Xi, . . . , X„ {n > 1) be a collection of iid random variables with i?|Xi|^+'' < 
oo for some r] G (0, 1]. Let /i = EXi, X„ = n~^ ELi ^i and p = (E|Xi|i+'')^. Then 

E\Xn - /i| < 3pn''/(^+'') for all n>l. (A.l) 

Proof: If p = 0, then (A.l) holds trivially. Hence, suppose that p > 0. With c„ = pn^'^^'^^' , 
let Xi, = XJ{\X,\ < c„), X2i = X, - Xu, 1 < 2 < n, and iyfe„ = n-iEr=i(^fe* - 
EXk^), k = 1,2. Then, E|X„ - p| < E|iyi„| + E|iy2„| < {n-'E\Xn\^y^^ + 2E\X2i\ < 
{n~^cl-'iE\Xi\^+''iy/^ + 2E|Xi|i+''c^'? < Spri'i/^^+'il 

Lemma 2: For random vectors X and F on a common probability space with E\g{Y)\°' < oo 
for some g : R' ^ R and a e [1, oo), E\E{g(Y)\X} - ^(r)|" < 2"E|^(r)|". 

Proof: Follows from Holder's and conditional Jensen's inequalities. 

Proof of Theorem 1: (i) By (C.l)(ii) and Taylor's expansion, for some Uu,U2i G [—1, 1], 



E 



{Mh(5 + ZmCj) - Mu{5 - ZmCj)} - 2z„Mi?(^) 



= 2-'[zm]^E\Mlf'\6 + uuzme,) - Mif'\6 + M2.^™e,)| < C^if^^]^ (A.2) 

for alH = 1, . . . , m, -m > 1. Since E^ Mj*j((5o) = Mii{6o) for all 6o G A, by Lemmas 1 and 2, 

e\m*,{5i) - M^,C5i)\ < 3E{E^Uyf) - h{9f)\'^'m,''^^'^'^} 
< C{r])E{E^Jh{ef)\^+^mo'"/'^'^^^} < Cir])No^^^'^'''\ (A.3) 

where 6i = 6 + Zm^j- Using (A.3) and similar arguments for M*j((5 — -2m Cj), we get 

E\M^^\5) - {2z^)-^{Mu(5 + z^e,) - Mu{5 - z„,ej)}\ < i2zJ-'{Cir])No''^^'+''^} (A.4) 

uniformly in i = 1, . . . ,772, m > 1. Part (i) of the theorem now follows from (A.2)-(A.4). 

Next consider (ii). By arguments similar to (A.2), E\{Mii{6 + Zm^j) + Mii{6 — Zm^j) — 
2Mii(5)} — (zm^ M{1' (5)\ < C ■ {zjnY uniformly in i = 1, . . . ,772, m > 1. Also, using Lemma 
1, the linearity of M{f *{5) in MJ^ (•) and arguments similar to (A.3), one can show that 

E|M(p(5) - {zrnr^{Mu{6 + Zrnei)+Mu(6-Zrne,)-2zrnM^^{6)}\ < C{r^){zrn)-^N^"'^'^''^^ 

(A.5) 
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uniformly in i = 1, . . . ,m, m > 1. Hence, part (ii) holds for all j, r G {1, . . . , k} with j = r. 
Next Rx I < j ^ r < k. Define M^f'^'^^S) = (2[2„]2)-i[{Mii(5 + ZmCj^r) + Mu{5 - z^ej^r) - 
2Mu{6)} - [zm]HMii'^\6) + m[1'"\6)}]. By Taylor's expansion 

E\M[f'-'^\6) - Mil''\6)\{2[z^]') < C[zmf. (A.6) 

Now using (A.6) and arguments similar to (A. 5), one can complete the proof of (ii). 



Proof of Theorem 2: Note that E ^{S*^)-S = b{6) = b and E ^{V*^) = V. Hence, for any 
J G {1, . . . , A;}, E\h*U) - m^ < N,'E{E^^{5*\3) - 5U)f} < 2N,'E{pm(5-. 2) + ||5f } = 
0{Nq^). Similarly, by Lemma 1, E\V*{j^r) — V{j^r)\ is bounded above by 



E 



No 

' 1=1 



< C{v) [E{pUh 2 + 2r])+ pUS; 2 + 2r7)}iVo''^ + ^{pn^(^; 2) + Pm{S; 2)}n^'^'] , 
for any j,r E {!,..., k}, where d* = Nq^ Z]/=i ^*'(j)- Theorem 2 follows from these bounds. 

Lemma 3 : Suppose that condition (C.3) holds. Then, for any 7 G [0,1]), E\\b — bW^^"^ = 
o(m-(i+7)) and E\\V - V\\^+^ = o(m-(i+T)). 

Proof: Fix 7 G (0,7^). Note that mE\\6 - 6f < C[\\a^{6)f + ||S„(5)f ] < CG2{S) < 00. 
Hence, S ^ 6 in mean sqrare and therefore, by the equicontinuity condition, ||am(^) ~"Om('5)|| 
and ||S„j(5) — Sm(5)|| both converge to zero in probability under 6. Further, the sequence 
{G2{Sy^^} is uniformly integrable. Hence, by the (extended) Dominated Convergence The- 
orem, [_E||am('^) — cim{(^)\\^^'' + E\\T,m{S) — Sm (5) II "'"+'''] ^ as m ^ oo, proving the lemma. 

Proof of Theorem 3: First we show that 

max E\M*.C6*) - Mu{Si)\ + max E\M;.{6) - M2i{6)\ = o{m-^). (A.7) 

l<i<m l<i<m 

Consider the first term on the left side. By arguments similar to (A. 5), maxi<i<m E\M^^{S^ ) — 
Mu{6*)\ < C{r])N^'"''^^^"\ Next, write A* = {5* G A} n {\M^^^\6)\'^ < (1 + logm)2} and 
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Ai = {6i G A} n {\M[f{6)\'^ < (1 + logm)2}, l<i<m,m>l. Then, using (4.13), it can 
be shown that 

E\Mu{K) - Mum < E\Mu(5l) - Mu{5i)\I{A* n A,) + E\Mu{5){I{^) + Hl^T)}] 

+EMu{5i)imr n Ai) + EMuC51)I{A: n A'i) 
= Rii + R2i + Rsi + Rii, say. (A.8) 

By (C.l), (C.2) (with rj = 1), (C.3) and arguments similar to the proof of Theorem 1, one 
gets maxi<,<„E|Mi?*(5) - Mif{6)\^ = 0{[z^]^ + [zm]-^N^'), maxi<,<^E|Mjf^>(5) - 
M[f^\6)\'^ = 0{[zm]'^ + [zm]~^No^), and E\\bf + E\\Vf = 0{m-^). Now using the above 
bounds, it can be shown (cf. (A. 17), Lahiri et al. (2006)) that 

maxi<i<mRii < Cimaxi<i<rnE\\6* - Si\\I{A* n Ai) = o{m'^). (A. 9) 

Since |M|j* (5)| > Co, there exist ei,e2 G (0, oo) such that |M}f (a;)| > ci for all a; G A with 
||a; — 6\\ < t2- Hence, by (C.l), there exists a C = C[t\) G (0, oo) such that on the set 
{\\b-b\\ <e2}, Pi-5|| <C[||6|| + ||1/||] foralH = l,...,m, m> 1. Hence, for any e > 0, 
by (C.l) and (C.4), (cf. (A.18)-(A.19), Lahiri, et al. (2006)) 

max P(\b,-b\\ >e) <P(\b-b\\ > £3) + P(C[||6|| + ||V||] > e) = O(m-(^+^')),(A.10) 

l<j<m 

max P(\M^^(S)\ < (l + logm)-2) <2P(\\6-6\\ > C{ei,e2)) = o(m-(^+'')). (A.ll) 

l<i<.m V / V / \ / 

Hence, it follows that 

max PiA'r) = 0(m-(i+^')). (A. 12) 

l<i<m V ^ 

We now obtain a similar bound on P([A*]'^). Since 6 is an interior point of A, there exists a 
63 G (0, 00) such that {x : \\x-6\\ < 63} C A. Let A*. = {6* G A} and A^- = {\Mif*{6)\-^ < 
(1 + logm)^}. By (5.3), and (A.7)-(A.12), uniformly over i = 1, . . . ,m, 

p{[A*r) < PiAi'r n a;, n a,) + +p(a^) + p(A-) 

< P{\\5, - 5\\ > es/2) + 2es~'E\\5: - 5,||/(A;, n A,) + P(A^) 

+ [p{\Mti5) - M,?(5)| > ^ - (Y^i^) + P(|M«(5) - M,?(5)| > ^); 
= 0(m-(i+'') + (logm)4[^„ + (z„iVo'/')-i]). (A.13) 

26 



Now using (A.12), (A.13) and condition (C.l), with af = P(A^) +P{[A*Y), we have 
R2^ < C{a^ + CME\\S-Sry/'} = oim-'), 



Rsi < E 



Mu{5i) - Mu(5) imf n Ai) + R2^ = o{m-^), 



Rai < E\M^,{5l)-Mu{5)\I{A*nA^)+R2^ 



o[m 



(A.14) 
(A.15) 
(A.16) 



uniformly in i G {l,...,n} (cf. (A.23)-(A.25), Lahiri et al (2006)). By (A.8), (A.9), 
and (A. 14)- (A. 16), the first term on the left of (A. 7) is o{m^^). The upper bound on 
the other term on the left of (A. 7) follows from condition (C.l), the independence of the 
resampled vectors {yf, . . . , y*') for / = 1, . . . , A^o and the fact E c{Ci{yf; S*^) - ^i{yf', ^))^ = 
M2i{S). Hence (A. 7) is proved which, in turn, implies that maxi<j<m-E "^speNEw(A) ~ 
mspei^yiiPi) = o{m~^). Next define the preliminary titled estimator 6i for the LM method 
by using the bias and the variance estimators b = b{6) and V = V{6). Note that with this 
choice of b and V, the regularity conditions for the validity of Theorem 3 of LM follow from 
conditions (C.1)-(C.4) and Lemma 3 above. Hence, (5.4) follows from Theorem 3 of LM. 
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Appendix B 

In this section, the simulation results are presented into subclasses as per the request of 
a referee. For example, in model I and Model III, the small areas are grouped into 3 classes 
having eaual sampling variances, denoted as Gl, G2 and G3. Thus each group represent 5 
areas and summary results are presented for each group. But for model II, 3 representative 
areas are chosen, namely the areas for rii = 6, Ui = 16 and rii = 36. Though they are 
not group in a true sense, they are also represented as Gl, G2 and G3 in the tables for 
convenience. Note that, in this case the estimates represent only thsese selected three areas, 
not the averages. 

The Table lb represnts the simulated bias and MSPE. For model I, the third group has 
higher bias and vice versa for model III. For model II, G3, the highest sample size has lowest 
bias. Interms of MSPE, for all the models, Gl is the highest, althogh the results between 
the goups are not drastically different. Also the kernel based method and the closed form 
formulas (wherever applicable) performs equally. 

Table lb. Absolute relative bias (Ti) and empirical MSPE (T2) for the EBP. Results using 
the kernel based approximations (3.6) and (3.7) are reported within the parentheses. 



28 







Model I 


Model II 


Model III 


asurei 


3 Group 


Median 


Mean 




Median 


Mean 




Gl 


2.201 


2.087 


0.276 


— 


— 






(2.119) 


(2.005) 


(0.271) 


(1.821) 


(1.194) 


Ti 


G2 


1.804 


2.196 


0.197 


— 


— 






(2.030) 


(2.066) 


(0.199) 


(1.001) 


(1.034) 




G3 


2.476 


4.846 


0.156 


— 


— 






(2.631) 


(4.265) 


(0.155) 


(.840) 


(0.825) 




Gl 


0.456 


0.435 


0.015 


— 


— 






(0.468) 


(0.483) 


(0.019) 


(0.300) 


(0.298) 


T2 


G2 


0.372 


0.360 


0.013 


— 


— 






(0.375) 


(0.362) 


(0.012) 


(0.272) 


(0.282) 




G3 


0.234 


0.240 


0.003 


— 


— 






(0.244) 


(0.243) 


(0.003) 


(0.250) 


(0.245) 
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The relative bias and the coefficient of variations of the MSPE estimates are presented 
in Table 2b. The results for LMl and LM2 are based on Remark 3 modification. However, 
they are fairly close when (4.6) was used instead. For all the groups the JLW shows slightly 
higher bias and CV compared to others. LMl and PR/DL performs equally well both in 
terms of bias and CV, LM2 has little higher CV for model I. For model II, CV under JLW 
is higher than that under LM2. For model III, LM2 performs well for all the groups. For 
large sample size, the CV under JLW is small yet larger than other methods. 

Table 2b. Relative biases (T^) and coefficient of variations (T^^) for the bias corrected 
estimators of the MSPE. Entries for LMl and LM2 are based on Remark 3 modification. 
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Model I Model II Model III 

Method Measures Group Median Mean Median Mean 

PR/DL T3 



JK T. 



n 



LMl T, 



T. 



LM2 T, 



n 



Gl 


0.016 


0.090 


— 


— 


— 


G2 


0.008 


0.063 


— 


— 


— 


G3 


0.106 


0.084 


— 


— 


— 


Gl 


0.184 


0.252 


— 


— 


— 


G2 


0.151 


0.203 


— 


— 


— 


G3 


0.119 


0.113 


— 


— 


— 


Gl 


0.287 


0.243 


-0.190 


— 


— 


G2 


0.124 


0.152 


-0.083 


— 


— 


G3 


0.124 


0.173 


0.025 


— 


— 


Gl 


0.924 


0.705 


1.532 


— 


— 


G2 


0.379 


0.449 


0.752 


— 


— 


G3 


0.366 


0.419 


0.360 


— 


— 


Gl 


-0.000 


0.072 


— 


— 


— 


G2 


-0.017 


0.036 


— 


— 


— 


G3 


0.061 


0.041 


— 


— 


— 


Gl 


0.190 


0.246 


— 


— 


— 


G2 


0.163 


0.196 


— 


— 


— 


G3 


0.083 


0.095 


— 


— 


— 


Gl 


-0.093 


-0.018 


-0.148 


-0.005 


-0.000 


G2 


-0.102 


-0.039 


0.094 


0.102 


0.009 


G3 


0.003 


-0.016 


0.054 


0.152 


0.108 


Gl 


0.276 


0.300 


0.154 


0.414 


0.368 


G2 


0.263 


0.274 


0.746 


0.102 


0.009 


G3 


0.201 


0.202 


0.054 


0.309 


0.202 
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